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1 Introduction 

Natural Language Generation (NLG) systems generate texts in English (or other human lan- 
guages, such as French) from computer-accessible data. NLG systems are (currently) most often 
used to help human authors write routine documents, including business letters [pBW91 | and 



weather reports |GDK94]. They also have been used as interactive explanation tools which 



communicate information in an understandable way to non-expert users, especially in software 
engineering (eg, pX9^ ) and medical (eg, [pMF+94| ]) contexts. 



From a technical perspective, almost all applied NLG systems perform the following three 



tasks [plei94|]: 



Content Determination and Text Planning: Decide what information should be commu- 
nicated to the user (content determination), and how this information should be rhetori- 
cally structured (text planning). These tasks are usually done simultaneously. 

Sentence Planning: Decide how the information will be split among individual sentences and 
paragraphs, and what cohesion devices (eg, pronouns, discourse markers) should be added 
to make the text flow smoothly. 

Realization: Generate the individual sentences in a grammatically correct manner. 

In the rest of this paper, I shall briefly discuss different ways of performing each of these tasks 



2 Content Determination and Text Planning 

Content determination (deciding what information to communicate in the text) and text-planning 
(organizing the information into a rhetorically coherent structure) are done simultaneously in 
most applied NLG systems |Rei94 1 . These tasks can be done at many different levels of sophis- 



tication. One of the simplest (and most common) approaches is simply to write a 'hard-coded' 
content/text-planner in a standard programming language (C-|--|-, Lisp, etc). The resultant 
system may lack flexibility, but if the texts being produced have a standardized content and 
structure (which is true in many technical domains), then this can be the most effective way to 
perform these tasks. 

On the other end of the sophistication spectrum, many standard AI techniques have been 
adopted for content determination and text planning, including rule-based systems |RML95| and 



planning | Hov88| ]. Systems built in this way are in principal very flexible and powerful, although 



in practice they have sometimes not been robust enough for real-world use. 

An intermediate approach which has been quite popular is to use a special 'schema' or 'text- 
planning' language [ |McK85 , [KKR91 |. Such languages typically allow the developer to represent 



1 



text plans as transition networks of one sort or another, with the nodes giving the information 
content and the arcs giving the rhetorical structure In many cases text-planning languages 
are implemented as macro packages, which gives the developer access to the full power of the 
underlying programming language whenever necessary. 

3 Sentence Planning 

Sentence planning includes 

• Conjunction and other aggregation. For example, transforming (1) into (2): 

1) Sam has high blood pressure. Sam has low blood sugar. 

2) Sam has high blood pressure and low blood sugar. 

• Pronominalization and other reference. For example, transforming (3) into (4): 

3) I just saw Mrs. Black. Mrs Black has a high temperature. 

4) I just saw Mrs. Black. She has a high temperature. 

• Introducing discourse markers. For example, transforming (5) into (6): 

5) If Sam goes to the hospital, he should go to the store. 

6) If Sam goes to the hospital, he should also go to the store. 

The common theme behind these operations is they do not change the information content of 
the text, but they do make it more fluent and easily readable. 

Sentence planning is important if the text needs to read fluently and, in particular, if it should 
look like it was written by a human (which is usually the case for business letters, for example). 
If it doesn't matter if the text sounds stilted and was obviously produced by a computer, then 
it may be possible to de-emphasize sentence planning, and perform minimal aggregation, use no 
pronouns, etc. 

If the text does need to look fluent, then a good job of sentence planning is essential. There 
are formal models of all of the operations mentioned above, and some applied NLG systems 



have incorporated them, eg, [MKS94]. It is also possible to do effective sentence-planning in an 
ad-hoc manner, at least in a limited domain; Knowledge Point's Performance Now system is a 
good example of this. 



4 Realization 



A Realizer generates individual sentences (typically from a 'deep syntactic' representation | Rei94 1 ) . 
The realizer needs to make sure that the rules of English are obeyed, including 

• Point absorption and other punctuation rules. For example, the sentence / saw Helen 
Jones, my sister-in-law should end in not 

• Morphology. For example, the plural of box is boxes, not boxs. 

• Agreement. For example, / am here instead of / is here. 

• Reflexives. For example, John saw himself, instead of John saw John. 

There are numerous linguistic formalisms and theories which can be incorporated into an 
NLG Realizer, far too many to describe here. There are also some general-purpose 'engines' 
which can be programmed with various linguistic rules, such as FUF [ Elh92 | and PENMAN 
|Pen8|. 



In many cases, acceptable performance can be achieved without using complex linguistic 
modules. In particular, if only a few different types of sentences are being generated, then it 
may be simpler and cheaper to use fill-in-the-blank templates for realization, instead of 'proper' 
syntactic processing. 



5 Conclusion 



Many different techniques are available for performing the three NLG tasks of content determi- 
nation (and text planning), sentence planning, and realization. These techniques range from the 
simplistic to the extremely sophisticated, and it is impossible to say that one is always 'better' 
than another. It all depends on the characteristics of the application, such as whether extensive 
filtering and summarization of information is needed, whether texts need to 'look like they were 
written by a person', and how much syntactic variety is expected to occur in the generated texts. 
A good NLG engineer will choose the most appropriate set of techniques, given the needs of the 



application | RM93 | and the available resources. 
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